Introduction

This report gives an overview of the data source and of the analysis it can support in answering a set of questions about the data. The data used for the IDA final group assessment task comes from the R package “cricketdata”, available at the following link - Cricket Data Source. Data on all international cricket matches is provided by ESPNCricinfo, and the package provides scraper functions that download this data into tibbles ready for analysis.

Case Study -

In the midst of the epidemic, four friends started a discussion about a common sport, cricket. Rahul was the die-hard fan; Aarathy was a fan guided by popular opinion and social trends; Soban knew only as much as the grapevine allowed; and Ed was hearing about the game for the first time.

Cricket is an international sport with players and fans around the world. International cricket is played as Test matches, One Day Internationals (ODIs) and T20s, along with World Cup tournaments.

The conversation had Rahul on edge. Aarathy kept using Instagram followers and likes on photographs to decide which cricketer was better. Soban, on the other hand, used memes and the grapevine to justify his choice of players in each category. Ed went through Google and YouTube to understand how the game is played and found some datasets of player statistics from around the world. None of this made sense to Rahul: he was using cold hard facts to support his choices, but he had no output to display that would convince the others. What he did have was determination, knowledge of working with data, and a platform to make it visually appealing.

Rahul started using cricket data to explore the performances of players in ODI cricket. Who is on top? Who has the best stats? Why? What are the best teams for fantasy cricket? What is the “all-time best eleven” for cricket?

The goal is to answer these questions and, finally, to assess each selected member of the “all-time best eleven” individually.

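# Install and load the packages used below
install.packages("devtools")
devtools::install_github("ropenscilabs/cricketdata")
library(cricketdata)  # fetch_cricinfo()
library(dplyr)        # %>% and select()
library(tidyr)        # separate()

# Download men's ODI career statistics and drop the Country column
cric_bat_data <- fetch_cricinfo("ODI", "Men", "Batting", "Career") %>% select(-Country)
cric_bowl_data <- fetch_cricinfo("ODI", "Men", "Bowling", "Career") %>% select(-Country)
cric_field_data <- fetch_cricinfo("ODI", "Men", "Fielding", "Career") %>% select(-Country)

# Save local copies of the raw downloads
write.csv(cric_bat_data, file = "Batting.csv")
write.csv(cric_bowl_data, file = "Bowling.csv")
write.csv(cric_field_data, file = "Fielding.csv")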
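# Split "Player" at the first "*" or "("; everything after it goes to "Region"
tidy_data <- function(data) {
  separate(data,
           col = "Player",
           into = c("Player", "Region"),
           sep = "[*(]",      # same matches as the original "[*(*]"
           extra = "merge",   # keep any further pieces in Region without a warning
           fill = "right")    # Region is NA when there is no separator
}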

bat_data <- tidy_data(cric_bat_data) %>% select(-Region)
bowl_data <- tidy_data(cric_bowl_data) %>% select(-Region)
field_data <- tidy_data(cric_field_data) %>% select(-Region)
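
To make the effect of the tidying step easier to see, here is a minimal sketch on a made-up table. The player names, the Runs values and the exact layout of the Player strings are assumptions for illustration only; the real ESPNCricinfo strings may look different. The trimws() call is an extra step that is not part of the code above; it removes the space that can be left behind before a bracket.

```r
library(dplyr)
library(tidyr)

# Hypothetical rows, invented for illustration (not real cricketdata output)
toy <- tibble(
  Player = c("A Batter (AUS)", "B Bowler* (IND)", "C Keeper"),
  Runs   = c(1200L, 850L, 40L)
)

toy_clean <- toy %>%
  # Split at the first "*" or "(", as in tidy_data()
  separate(col = "Player", into = c("Player", "Region"),
           sep = "[*(]", extra = "merge", fill = "right") %>%
  # Assumption: strip leftover whitespace around the name
  mutate(Player = trimws(Player)) %>%
  # Drop the helper column, as the report does
  select(-Region)

print(toy_clean)
```

Rows without a separator, such as the third one, keep their name unchanged and only get a missing Region, which is dropped again in the last step.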

A Sample Analysis Plot planned -

Individual Player Analysis -